Queries, the Missing Link in Automatic Data Integration
Abstract
This paper introduces the ontology mapping approach of a system that automatically integrates data sources into an ontology-based data integration (OBDI) system. In addition to the target and source ontologies, the mapping algorithm requires a SPARQL query to determine the ontology mapping. Further, the mapping algorithm is dynamic: it runs each time a query is processed and produces only a partial mapping sufficient to reformulate that query. This approach enables the mapping algorithm to exploit query semantics to choose correctly among ontology mappings that are indistinguishable when only the ontologies are considered. The mapping also associates paths with paths, instead of entities with entities, which simplifies query reformulation. The system achieves favorable results when compared to the algorithms developed for Clio, the best automated relational data integration system.

We have developed an ontology-based data integration (OBDI) system that departs from the conventional OBDI organization. The goal is to include automatic integration of new data sources, provided those data sources publish a self-describing ontology. A consequence of that goal is that an engineer no longer has the opportunity to review and correct an ontology matching before the query reformulation system uses it. As ontology matching is understood to be an uncertain process, some other method of mapping refinement is needed. Our system uses queries for this purpose.

Ontology mapping in conventional OBDI systems is determined prior to, and without knowledge of, the queries to be executed [3]. A static representation of a mapping between the target and source ontologies serves as input to a query reformulation module (Fig. 1(a)). In the system described here, ontology mapping is a dynamically computed component whose result depends on the query being processed (Fig. 1(b)). In effect, the query becomes a third argument to the ontology mapping algorithm.
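The contrast between the two organizations can be illustrated with a minimal sketch. It is not the paper's algorithm: the `CANDIDATES` table stands in for whatever a real matcher would derive from the target and source ontologies, every `S:` name and every `T:` name other than `T:People` is hypothetical, and the regular expression is a naive substitute for parsing SPARQL. The sketch maps terms rather than paths, so it shows only the "query as an extra argument" idea and the resulting partial mapping.

```python
import re
from typing import Dict, Set

# Hypothetical correspondences (target term -> source term). A real matcher
# would compute these from the target and source ontologies.
CANDIDATES: Dict[str, str] = {
    "T:People": "S:Person",
    "T:name": "S:fullName",
    "T:Article": "S:Paper",
    "T:title": "S:heading",
}


def static_mapping(target_terms: Set[str]) -> Dict[str, str]:
    """Conventional organization: map every target term once, before any query."""
    return {t: CANDIDATES[t] for t in sorted(target_terms) if t in CANDIDATES}


def terms_in_query(sparql: str) -> Set[str]:
    """Naively collect T:-prefixed names; a real system would parse the query."""
    return set(re.findall(r"\bT:\w+", sparql))


def dynamic_mapping(target_terms: Set[str], sparql: str) -> Dict[str, str]:
    """Query-driven organization: the query is an additional argument, and only
    the terms it mentions are mapped, giving a partial mapping."""
    return static_mapping(target_terms & terms_in_query(sparql))


TARGET_TERMS = set(CANDIDATES)
QUERY = "SELECT ?n WHERE { ?p a T:People . ?p T:name ?n }"

if __name__ == "__main__":
    print("static: ", static_mapping(TARGET_TERMS))
    print("dynamic:", dynamic_mapping(TARGET_TERMS, QUERY))
```

In this toy setting the dynamic call returns correspondences only for the two terms the query uses, while the static call maps all four.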
The query provides context for selecting among competing mappings. Since a mapping is specific to a query, the results may be limited to the partial mapping required by the query reformulation system.

The organization was motivated by the following observations. A mapping method may determine that an entity in one ontology maps with equal likelihood to two or more entities in the other ontology. The mapping and reformulation of certain queries is correct only if one pairing is chosen. The correct choice may be different for different queries. The query itself may lend additional semantics that correctly resolve the ambiguity. These observations are supported by the example in Fig. 2. Looking at the ontologies alone, there is insufficient information to determine if the class T:People should ...

[Fig. 1, panel (a): Traditional. Diagram labels: Ontology T, Ontology S, Query q, Ontology Mapping, Query Reformulation, Reformulated query.]
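The observations about ambiguity can be made concrete with a small sketch. Fig. 2 is not reproduced on this page, so the example below is invented: the classes `S:Author` and `S:Reviewer`, all property names, and the scoring rule (count how many properties used in the query each candidate class supports) are assumptions chosen for illustration, not the method described in the paper.

```python
from typing import Dict, List, Set

# Hypothetical source ontology: class -> properties whose domain is that class.
SOURCE_PROPERTIES: Dict[str, Set[str]] = {
    "S:Author": {"S:name", "S:wrote"},
    "S:Reviewer": {"S:name", "S:reviewed"},
}

# Hypothetical property correspondences, assumed here to be unambiguous.
PROPERTY_MAP: Dict[str, str] = {
    "T:name": "S:name",
    "T:authorOf": "S:wrote",
    "T:refereeOf": "S:reviewed",
}


def resolve_class(candidates: List[str], query_properties: Set[str]) -> List[str]:
    """Keep the candidate classes that support the most properties used in the query."""
    needed = {PROPERTY_MAP[p] for p in query_properties if p in PROPERTY_MAP}
    support = {c: len(needed & SOURCE_PROPERTIES[c]) for c in candidates}
    best = max(support.values())
    return [c for c in candidates if support[c] == best]


# Suppose a matcher rates both classes as equally likely matches for T:People.
TIED = ["S:Author", "S:Reviewer"]

if __name__ == "__main__":
    print(resolve_class(TIED, {"T:name", "T:authorOf"}))
    print(resolve_class(TIED, {"T:name", "T:refereeOf"}))
    print(resolve_class(TIED, {"T:name"}))
```

Under these assumptions, a query that uses `T:authorOf` selects `S:Author`, a query that uses `T:refereeOf` selects `S:Reviewer`, and a query that uses only `T:name` leaves the tie unresolved, which mirrors the point that the correct choice may differ from query to query.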
Similar resources
Integration of Visible Image and LIDAR Altimetric Data for Semi-Automatic Detection and Measuring the Boundari of Features
This paper presents a new method for detecting the features using LiDAR data and visible images. The proposed features detection algorithm has the lowest dependency on region and the type of sensor used for imaging, and about any input LiDAR and image data, including visible bands (red, green and blue) with high spatial resolution, identify features with acceptable accuracy. In the proposed app...
Probabilistic Linkage of Persian Record with Missing Data
Extended Abstract. When the comprehensive information about a topic is scattered among two or more data sets, using only one of those data sets would lead to information loss available in other data sets. Hence, it is necessary to integrate scattered information to a comprehensive unique data set. On the other hand, sometimes we are interested in recognition of duplications in a data set. The i...
Semantic Constraint and QoS-Aware Large-Scale Web Service Composition
Service-oriented architecture facilitates the running time of interactions by using business integration on the networks. Currently, web services are considered as the best option to provide Internet services. Due to an increasing number of Web users and the complexity of users’ queries, simple and atomic services are not able to meet the needs of users; and to provide complex services, it requ...
Psychological Therapies: The Missing Link in Improving Treatment Adherence in Patients with β-thalassemia Major
Improving the View Selection Algorithm in a Data Warehouse by Finding Frequent Queries
A data warehouse is a source for storing historical data to support decision making. Usually analytic queries take much time. To solve response time problem it should be materialized some views to answer all queries in minimum response time. There are many solutions for view selection problems. The most appropriate solution for view selection is materializing frequent queries. Previously posed ...
Publication date: 2012